In Next.js App Router, Route Handlers can stream responses using the Web Streams API via ReadableStream or TransformStream. This allows data to be sent to the client progressively — chunk by chunk — instead of waiting for the entire response to be ready.
Streaming is useful when you have long-running operations like AI text generation, large data exports, or real-time logs. Instead of the client waiting for the full response, it starts receiving and rendering data immediately as chunks arrive.
AI/LLM responses — stream tokens as they are generated (e.g. ChatGPT-style)
Large data exports — stream CSV/JSON rows without loading all data into memory
Real-time logs — stream server logs or job progress to the client
Long-running computations — send partial results as they become available
Server-Sent Events (SSE) — push updates from server to client over HTTP
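A minimal sketch of a streaming Route Handler using ReadableStream — the route path, chunk contents, and delay are illustrative assumptions, not from the original:

```typescript
// app/api/stream/route.ts — minimal streaming Route Handler sketch.
// Chunk data and the 100 ms delay simulate a long-running operation.
export async function GET(): Promise<Response> {
  const encoder = new TextEncoder();

  const stream = new ReadableStream<Uint8Array>({
    async start(controller) {
      for (const chunk of ["first ", "second ", "third"]) {
        controller.enqueue(encoder.encode(chunk)); // sent as soon as enqueued
        await new Promise((resolve) => setTimeout(resolve, 100)); // simulate work
      }
      controller.close(); // always close when done
    },
  });

  return new Response(stream, {
    headers: {
      "Content-Type": "text/plain; charset=utf-8",
      "Cache-Control": "no-cache",
    },
  });
}
```

The client starts receiving "first " roughly 100 ms before "third" arrives, instead of waiting for the whole body.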
Always set 'Cache-Control: no-cache' for streaming responses to prevent buffering
Don't set 'Transfer-Encoding: chunked' manually; chunked encoding is applied automatically when a stream is returned. Set 'Content-Type: text/event-stream' for SSE
Always call controller.close() when done — leaving streams open causes memory leaks
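The header and close() rules above can be sketched as an SSE Route Handler — the route path and event payloads are illustrative:

```typescript
// app/api/events/route.ts — SSE Route Handler sketch.
// Event payloads here are placeholders for real updates.
export async function GET(): Promise<Response> {
  const encoder = new TextEncoder();

  const stream = new ReadableStream<Uint8Array>({
    start(controller) {
      // SSE frames are "data: <payload>\n\n" (blank line terminates each event)
      for (let i = 1; i <= 3; i++) {
        controller.enqueue(encoder.encode(`data: update ${i}\n\n`));
      }
      controller.close(); // without this, the connection (and memory) stays held
    },
  });

  return new Response(stream, {
    headers: {
      "Content-Type": "text/event-stream",
      "Cache-Control": "no-cache",
      Connection: "keep-alive",
    },
  });
}
```

On the browser side, `new EventSource("/api/events")` would receive each `data:` frame as a `message` event.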
Edge Runtime supports streaming natively and typically has lower cold-start latency than the Node.js runtime
Node.js runtime (and intermediaries such as compression proxies) may buffer small chunks — send larger chunks if output appears delayed
ReadableStream is a Web API — available in both Edge and Node.js runtimes in Next.js 13+
For AI streaming, Vercel AI SDK abstracts all of this with useChat and useCompletion hooks
ReadableStream — one-time large response, file downloads, AI completions
SSE (text/event-stream) — server pushing multiple updates over time, live feeds, notifications
WebSocket — full duplex, both client and server send messages, chat apps, collaboration tools